(metaindex = harmonic mean of all 3 accuracy metrics)
metaindex method
ElasticNet 0.8188778 ElasticNet
Mystepwise_glm_binomial 0.8182966 Mystepwise_glm_binomial
AUC_MDL 0.8161237 AUC_MDL
miRy
ElasticNet Class ~ hsa.let.7b.5p + hsa.miR.30d.5p + hsa.miR.320b + hsa.miR.19b.3p + hsa.miR.20b.5p + hsa.miR.1304.3p + hsa.miR.139.3p + hsa.miR.375.3p
Mystepwise_glm_binomial Class ~ hsa.miR.19b.3p + hsa.miR.4433a.3p + hsa.let.7b.5p + hsa.miR.106b.5p + hsa.miR.1304.3p + hsa.miR.30d.5p + hsa.miR.320b + hsa.miR.375.3p
AUC_MDL Class ~ hsa.miR.20b.5p + hsa.miR.19b.3p + hsa.let.7b.5p + hsa.miR.320b + hsa.miR.30d.5p + hsa.miR.139.3p + hsa.miR.17.5p + hsa.miR.182.5p + hsa.miR.421 + hsa.miR.375.3p
Performance of those signatures:
$`ElasticNet:AUC_MDL`
[1] "hsa.miR.20b.5p" "hsa.miR.139.3p"
$Mystepwise_glm_binomial
[1] "hsa.miR.4433a.3p" "hsa.miR.106b.5p"
$`ElasticNet:Mystepwise_glm_binomial`
[1] "hsa.miR.1304.3p"
$`ElasticNet:Mystepwise_glm_binomial:AUC_MDL`
[1] "hsa.let.7b.5p" "hsa.miR.30d.5p" "hsa.miR.320b" "hsa.miR.19b.3p" "hsa.miR.375.3p"
$AUC_MDL
[1] "hsa.miR.17.5p" "hsa.miR.182.5p" "hsa.miR.421"
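The listing above splits the three signatures into exclusive overlap regions, as in a Venn diagram: each miRNA is listed under the exact combination of methods that selected it. A minimal sketch of that partition follows, written in Python purely for illustration. The report itself was produced in R, and the helper name `exclusive_overlaps` is hypothetical, not an OmicSelector function.

```python
def exclusive_overlaps(signatures):
    """Group each feature under the exact combination of methods that selected it."""
    regions = {}
    all_features = sorted({f for feats in signatures.values() for f in feats})
    for feature in all_features:
        # Method names are joined in the order they appear in `signatures`.
        key = ":".join(m for m, feats in signatures.items() if feature in feats)
        regions.setdefault(key, []).append(feature)
    return regions


# Signatures copied from the three formulas listed above.
signatures = {
    "ElasticNet": [
        "hsa.let.7b.5p", "hsa.miR.30d.5p", "hsa.miR.320b", "hsa.miR.19b.3p",
        "hsa.miR.20b.5p", "hsa.miR.1304.3p", "hsa.miR.139.3p", "hsa.miR.375.3p",
    ],
    "Mystepwise_glm_binomial": [
        "hsa.miR.19b.3p", "hsa.miR.4433a.3p", "hsa.let.7b.5p", "hsa.miR.106b.5p",
        "hsa.miR.1304.3p", "hsa.miR.30d.5p", "hsa.miR.320b", "hsa.miR.375.3p",
    ],
    "AUC_MDL": [
        "hsa.miR.20b.5p", "hsa.miR.19b.3p", "hsa.let.7b.5p", "hsa.miR.320b",
        "hsa.miR.30d.5p", "hsa.miR.139.3p", "hsa.miR.17.5p", "hsa.miR.182.5p",
        "hsa.miR.421", "hsa.miR.375.3p",
    ],
}

regions = exclusive_overlaps(signatures)
for key, features in regions.items():
    print(key, features)
```

Within each region the sketch sorts the miRNAs alphabetically, so the order may differ from the listing above, but the region membership is the same.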
(metaindex = mean of 2 accuracy metrics: testing and validation)
metaindex method
feseR_combineFS_RF_SMOTE 0.7945796 feseR_combineFS_RF_SMOTE
fwrap 0.7928404 fwrap
AUC_MDL 0.7927240 AUC_MDL
miRy
feseR_combineFS_RF_SMOTE Class ~ hsa.miR.320b + hsa.let.7b.5p + hsa.miR.19b.3p + hsa.miR.20b.5p + hsa.miR.30d.5p + hsa.miR.139.3p + hsa.miR.17.5p + hsa.miR.375.3p
fwrap Class ~ hsa.miR.1273h.3p + hsa.miR.182.5p + hsa.miR.20b.5p + hsa.miR.320b
AUC_MDL Class ~ hsa.miR.20b.5p + hsa.miR.19b.3p + hsa.let.7b.5p + hsa.miR.320b + hsa.miR.30d.5p + hsa.miR.139.3p + hsa.miR.17.5p + hsa.miR.182.5p + hsa.miR.421 + hsa.miR.375.3p
Performance of those signatures:
$`fwrap:AUC_MDL`
[1] "hsa.miR.182.5p"
$`feseR_combineFS_RF_SMOTE:AUC_MDL`
[1] "hsa.let.7b.5p" "hsa.miR.19b.3p" "hsa.miR.30d.5p" "hsa.miR.139.3p" "hsa.miR.17.5p" "hsa.miR.375.3p"
$`feseR_combineFS_RF_SMOTE:fwrap:AUC_MDL`
[1] "hsa.miR.320b" "hsa.miR.20b.5p"
$fwrap
[1] "hsa.miR.1273h.3p"
$AUC_MDL
[1] "hsa.miR.421"
(metaindex = mean of sensitivity and specificity in validation dataset)
metaindex method
fwrap 0.5701070 fwrap
feseR_combineFS_RF_SMOTE 0.5648820 feseR_combineFS_RF_SMOTE
AUC_MDLSMOTE 0.5583661 AUC_MDLSMOTE
miRy
fwrap Class ~ hsa.miR.1273h.3p + hsa.miR.182.5p + hsa.miR.20b.5p + hsa.miR.320b
feseR_combineFS_RF_SMOTE Class ~ hsa.miR.320b + hsa.let.7b.5p + hsa.miR.19b.3p + hsa.miR.20b.5p + hsa.miR.30d.5p + hsa.miR.139.3p + hsa.miR.17.5p + hsa.miR.375.3p
AUC_MDLSMOTE Class ~ hsa.miR.20b.5p + hsa.miR.19b.3p + hsa.let.7b.5p + hsa.miR.320b + hsa.miR.139.3p + hsa.miR.30d.5p + hsa.miR.17.5p + hsa.miR.182.5p + hsa.miR.421 + hsa.miR.375.3p
Performance of those signatures:
$AUC_MDLSMOTE
[1] "hsa.miR.421"
$`fwrap:AUC_MDLSMOTE`
[1] "hsa.miR.182.5p"
$`feseR_combineFS_RF_SMOTE:AUC_MDLSMOTE`
[1] "hsa.let.7b.5p" "hsa.miR.19b.3p" "hsa.miR.30d.5p" "hsa.miR.139.3p" "hsa.miR.17.5p" "hsa.miR.375.3p"
$`fwrap:feseR_combineFS_RF_SMOTE:AUC_MDLSMOTE`
[1] "hsa.miR.20b.5p" "hsa.miR.320b"
$fwrap
[1] "hsa.miR.1273h.3p"
By default, this is performed for the top 6 sets that achieved the best accuracy in training, testing and validation.
For the top 6 methods.
By default, we choose the best-performing set, i.e. the one that achieved the best mean accuracy in training, testing and validation.
Best set:
metaindex method
ElasticNet 0.8188778 ElasticNet
miRy
ElasticNet Class ~ hsa.let.7b.5p + hsa.miR.30d.5p + hsa.miR.320b + hsa.miR.19b.3p + hsa.miR.20b.5p + hsa.miR.1304.3p + hsa.miR.139.3p + hsa.miR.375.3p
The differential expression of the miRNAs in the best set, shown below, should serve as a sanity check.
miR log2FC p-value p-value BH
3 hsa.let.7b.5p -0.5614259 1.065803e-22 6.750086e-22
5 hsa.miR.30d.5p -0.5107887 2.985520e-16 1.134498e-15
4 hsa.miR.320b 0.7320847 6.053835e-17 2.875571e-16
2 hsa.miR.19b.3p -0.9074407 8.427626e-24 8.006244e-23
1 hsa.miR.20b.5p -1.2362520 4.093534e-25 7.777715e-24
17 hsa.miR.1304.3p 0.5522948 5.913518e-04 6.609226e-04
7 hsa.miR.139.3p 0.7180460 9.716613e-15 2.637366e-14
10 hsa.miR.375.3p 1.0764252 1.141904e-09 2.169618e-09
Based on the benchmark results. You could achieve a better model by tuning it further. Metaindex: harmonic mean of the accuracy on the training, testing and validation datasets. Metaindex2: mean accuracy on the testing and validation datasets only.
Name ID Modelling Method Selection Method Train ROC AUC Train Acc
153 C5.0+RandomForestRFESMOTE_sig 1661466163.44172 C5.0 RandomForestRFESMOTE_sig 1 1
134 C5.0+feseR_combineFS_RF_SMOTE 1661466123.98702 C5.0 feseR_combineFS_RF_SMOTE 0.992423055838358 0.971238938053097
148 C5.0+SU_MDLSMOTE 1661466152.72664 C5.0 SU_MDLSMOTE 1 1
120 rf+RandomForestRFESMOTE_sig 1661466089.46937 rf RandomForestRFESMOTE_sig 1 1
112 rf+sigtopSMOTE 1661466075.12167 rf sigtopSMOTE 1 1
145 C5.0+sigtopSMOTE 1661466145.20905 C5.0 sigtopSMOTE 0.999921685331663 0.995575221238938
131 rf+ElasticNet 1661466109.38676 rf ElasticNet 1 1
143 C5.0+sigtop 1661466140.78412 C5.0 sigtop 0.999892732636095 0.994884910485934
152 C5.0+RandomForestRFE_sig 1661466160.63635 C5.0 RandomForestRFE_sig 1 1
151 C5.0+RandomForestRFE 1661466157.97191 C5.0 RandomForestRFE 1 1
147 C5.0+AUC_MDLSMOTE 1661466149.76921 C5.0 AUC_MDLSMOTE 1 1
100 rf+feseR_combineFS_RF 1661466053.67859 rf feseR_combineFS_RF 1 1
119 rf+RandomForestRFE_sig 1661466087.65714 rf RandomForestRFE_sig 1 1
164 C5.0+ElasticNet 1661466183.65821 C5.0 ElasticNet 0.999570930544382 0.989769820971867
101 rf+feseR_combineFS_RF_SMOTE 1661466055.33465 rf feseR_combineFS_RF_SMOTE 1 1
146 C5.0+topFCSMOTE 1661466147.08534 C5.0 topFCSMOTE 0.999706319993735 0.988938053097345
108 rf+AUC_MDL 1661466067.82423 rf AUC_MDL 1 1
111 rf+topFC 1661466073.03387 rf topFC 1 1
129 rf+spFSR 1661466105.67747 rf spFSR 1 1
118 rf+RandomForestRFE 1661466085.79472 rf RandomForestRFE 1 1
Test Acc Valid Acc Metaindex
153 0.884615384615385 0.78030303030303 0.879252752690833
134 0.907692307692308 0.765151515151515 0.872539853809606
148 0.830769230769231 0.803030303030303 0.869820686860501
120 0.884615384615385 0.757575757575758 0.869455645161291
112 0.884615384615385 0.757575757575758 0.869455645161291
145 0.884615384615385 0.757575757575758 0.868337155321886
131 0.869230769230769 0.765151515151515 0.86771078841329
143 0.861538461538462 0.772727272727273 0.867058708758512
152 0.853846153846154 0.772727272727273 0.865728704694908
151 0.853846153846154 0.772727272727273 0.865728704694908
147 0.884615384615385 0.742424242424242 0.862720081653483
100 0.884615384615385 0.742424242424242 0.862720081653483
119 0.861538461538462 0.757575757575758 0.86189205828032
164 0.869230769230769 0.757575757575758 0.861876183829079
101 0.846153846153846 0.765151515151515 0.859907120743034
146 0.861538461538462 0.757575757575758 0.859131139911525
108 0.869230769230769 0.742424242424242 0.857784663051898
111 0.846153846153846 0.757575757575758 0.856697819314642
129 0.853846153846154 0.75 0.856041131105398
118 0.853846153846154 0.75 0.856041131105398
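The Metaindex column in the table above can be checked by hand: it matches the harmonic mean of the three accuracy columns (Train Acc, Test Acc, Valid Acc). The following is a small sketch in Python, used here only for illustration since the report was generated in R. The accuracies are copied from the first two benchmark rows.

```python
from statistics import harmonic_mean

# (train, test, validation) accuracies copied from the first two benchmark rows.
rows = {
    "C5.0+RandomForestRFESMOTE_sig": (1.0, 0.884615384615385, 0.78030303030303),
    "C5.0+feseR_combineFS_RF_SMOTE": (0.971238938053097, 0.907692307692308, 0.765151515151515),
}

# The harmonic mean penalises a single weak accuracy more than the arithmetic mean does.
metaindex = {name: harmonic_mean(acc) for name, acc in rows.items()}
for name, value in metaindex.items():
    print(f"{name}: {value:.6f}")
```

Both values agree with the Metaindex column, whereas the arithmetic mean of the first row (about 0.888) does not.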
Metrics on training set:
Confusion Matrix and Statistics
Reference
Prediction Control Case
Control 165 0
Case 0 226
Accuracy : 1
95% CI : (0.9906, 1)
No Information Rate : 0.578
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 1
Mcnemar's Test P-Value : NA
Sensitivity : 1.000
Specificity : 1.000
Pos Pred Value : 1.000
Neg Pred Value : 1.000
Prevalence : 0.578
Detection Rate : 0.578
Detection Prevalence : 0.578
Balanced Accuracy : 1.000
'Positive' Class : Case
Model file: models/1661466163.44172.RDS
Call:
roc.formula(formula = train$Class ~ predtrain_y)
Data: predtrain_y in 165 controls (train$Class Control) < 226 cases (train$Class Case).
Area under the curve: 1
95% CI: 1-1 (DeLong)
Metrics on testing set:
Confusion Matrix and Statistics
Reference
Prediction Control Case
Control 49 9
Case 6 66
Accuracy : 0.8846
95% CI : (0.8168, 0.934)
No Information Rate : 0.5769
P-Value [Acc > NIR] : 1.721e-14
Kappa : 0.7653
Mcnemar's Test P-Value : 0.6056
Sensitivity : 0.8800
Specificity : 0.8909
Pos Pred Value : 0.9167
Neg Pred Value : 0.8448
Prevalence : 0.5769
Detection Rate : 0.5077
Detection Prevalence : 0.5538
Balanced Accuracy : 0.8855
'Positive' Class : Case
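The statistics reported for the testing set follow directly from the four cells of the confusion matrix. The sketch below recomputes them in Python for illustration only (the report was produced in R), using the standard textbook definitions with Case as the positive class. The variable names are ours.

```python
# Cells of the testing-set confusion matrix above ('Positive' class: Case).
tp, fn = 66, 9  # reference Case: predicted Case / predicted Control
tn, fp = 49, 6  # reference Control: predicted Control / predicted Case
n = tp + fn + tn + fp

accuracy = (tp + tn) / n
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)  # Pos Pred Value
npv = tn / (tn + fn)  # Neg Pred Value
balanced_accuracy = (sensitivity + specificity) / 2
no_information_rate = max(tp + fn, tn + fp) / n  # share of the larger class

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
kappa = (accuracy - expected) / (1 - expected)

for label, value in [
    ("Accuracy", accuracy), ("Kappa", kappa),
    ("Sensitivity", sensitivity), ("Specificity", specificity),
    ("Pos Pred Value", ppv), ("Neg Pred Value", npv),
    ("Balanced Accuracy", balanced_accuracy),
    ("No Information Rate", no_information_rate),
]:
    print(f"{label}: {value:.4f}")
```

The same arithmetic applies to the training and validation matrices once their cell counts are substituted.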
Metrics on validation set:
Confusion Matrix and Statistics
Reference
Prediction Control Case
Control 59 5
Case 24 44
Accuracy : 0.7803
95% CI : (0.7, 0.8477)
No Information Rate : 0.6288
P-Value [Acc > NIR] : 0.0001367
Kappa : 0.564
Mcnemar's Test P-Value : 0.0008302
Sensitivity : 0.8980
Specificity : 0.7108
Pos Pred Value : 0.6471
Neg Pred Value : 0.9219
Prevalence : 0.3712
Detection Rate : 0.3333
Detection Prevalence : 0.5152
Balanced Accuracy : 0.8044
'Positive' Class : Case
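The McNemar p-value depends only on the two off-diagonal cells, i.e. on whether the two kinds of misclassification are equally frequent. The sketch below assumes the continuity-corrected chi-squared form, which is the default of R's `mcnemar.test`. It is written in Python for illustration, and the function name `mcnemar_p` is hypothetical.

```python
import math


def mcnemar_p(b, c):
    """McNemar's chi-squared test with continuity correction (1 degree of freedom)."""
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))


# Off-diagonal cells taken from the confusion matrices above.
p_test = mcnemar_p(9, 6)    # testing set: 9 cases called Control, 6 controls called Case
p_valid = mcnemar_p(5, 24)  # validation set: 5 cases called Control, 24 controls called Case

print(f"testing:    {p_test:.4f}")
print(f"validation: {p_valid:.7f}")
```

On the validation set the small p-value reflects the imbalance between the error types: 24 controls were called Case but only 5 cases were called Control, which is also why specificity (0.7108) falls well below sensitivity (0.8980).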
This is the end. Timestamp of the analysis:
[2022-08-26 00:32:05 | pid:19469] [OmicSelector: TASK COMPLETED]